MERLIN: An Online Trilingual Learner Corpus Empirically Grounding the European Reference Levels in Authentic Learner Data

Authors

  • Katrin Wisniewski
  • Karin Schöne
  • Lionel Nicolas
  • Chiara Vettori
  • Adriane Boyd
  • Detmar Meurers
  • Andrea Abel
  • Jirka Hana
Abstract

Since its publication in 2001, the Common European Framework of Reference for Languages (CEFR) has gained a leading role as an instrument of reference for language teaching and certification. Nonetheless, there is a growing concern about CEFR levels being insufficiently illustrated in terms of authentic learner data. This concern is even stronger for languages other than English (cf., e.g., Hulstijn 2007, North 2000). In this paper, we present the MERLIN project, which addresses this need by illustrating and validating the CEFR levels for Czech, German, and Italian. To this end, we are developing a didactically motivated online platform to enable CEFR users to explore authentic written learner productions that have been related in a methodologically sophisticated and rigorous way to the CEFR levels. By making a significant number of learner productions freely accessible and easily searchable in a form that is richly annotated with linguistic characteristics and learner error types, the platform will assist teachers, learners, test developers, textbook authors, teacher trainers, and educational policy makers in developing a more comprehensive conceptualization of CEFR levels based on authentic learner data. In the first, methodology-oriented part of this paper, we explain how the learner textual data were collected, re-rated, transcribed, double-checked, and prepared for additional manual and automatic processing. We then illustrate the indicators we built to analyze L2 productions. Indicators were derived through (a) linguistic analyses of the performance samples, (b) the operationalization of the CEFR scale descriptors, (c) the study of relevant literature on SLA and language testing, (d) textbook analyses, and (e) a questionnaire study.
This study allowed us to devise a harmonized annotation schema taking into account both common and language-specific features (e.g., gender/article in German, reflexive possessive pronouns in Czech, pronoun particles in Italian). In the second, application-oriented part, we explain how, by offering a large corpus of freely accessible empirical material, the project helps provide a fine-grained characterization of the CEFR levels and how it serves language teaching and learning. MERLIN thereby aims to respond to the suggestions of the Council of Europe itself, which solicits the development of supplementary tools for illustrating the CEFR levels (http://purl.org/net/CEFR-Goullier.doc). Furthermore, we explain how the platform enables the targeted users to retrieve authentic information about the relationship of the CEFR levels to a wide spectrum of well-defined, user-need-oriented L2 challenges. MERLIN users, such as teachers or learners, can thus compare their students' or their own performances and get a clearer picture of their strengths and weaknesses. In the third, research-oriented part, we situate MERLIN with regard to two current topics in Second Language Acquisition: validation of CEFR scales and natural language processing for learner language. [This publication reports on work from the MERLIN project, funded by the European Commission (518989-LLP-1-2011-DE-KA2-KA2MP). It only reflects the views of the authors and the Commission cannot be held responsible for any use which may be made of the information contained therein.]


Related resources

The MERLIN corpus: Learner language and the CEFR

The MERLIN corpus is a written learner corpus for Czech, German, and Italian that has been designed to illustrate the Common European Framework of Reference for Languages (CEFR) with authentic learner data. The corpus contains 2,290 learner texts produced in standardized language certifications covering CEFR levels A1–C1. The MERLIN annotation scheme includes a wide range of language characteri...


Design and Development of the MERLIN Learner Corpus Platform

In this paper we report on the design and development of an online search platform for the MERLIN corpus of learner texts in Czech, German and Italian. It was created in the context of the MERLIN project, which aims at empirically illustrating features of the Common European Framework of Reference (CEFR) for evaluating language competences based on authentic learner text productions compiled i...


ARIDA: An Arabic Interlanguage Database and Its Applications: A Pilot Study

This paper describes a pilot study in which we collected a small learner corpus of Arabic, developed a tagset for error annotation of Arabic learner data, tagged the data for error, and performed simple Computer-aided Error Analysis (CEA). Language Learner Corpora and Applications Learner corpora research uses the methods and tools of Second Language Acquisition (SLA) studies and corpus linguist...


Correlation between Online Learner Readiness with Psychological Distress related to e-Learning among Nursing and Midwifery Students during COVID-19 pandemic

Introduction: With the sudden shift of face-to-face education to e-learning during the COVID-19 pandemic, awareness of learners' readiness for online learning and its impact on students' psychological distress related to e-learning is important for teachers, counselors, and educational planners. Therefore, the present study was conducted to investigate the correlation between on...


Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

Publication date: 2013